We show that there exist bipartite quantum states which contain large hidden classical correlation that can be unlocked by a disproportionately small amount of classical communication. In particular, there are $(2n+1)$-qubit states for which a one-bit message doubles the optimal classical mutual information between measurement results on the subsystems, from $n/2$ bits to $n$ bits. States exhibiting this behavior need not be entangled. We study the range of states exhibiting this phenomenon and bound its magnitude.



The study of possible correlations between quantum systems was initiated by Einstein, Podolsky and Rosen [1] and Schrödinger [2]. These pioneers were concerned with entanglement: quantum correlations that are nonexistent in classical physics. Recent developments in quantum information theory have motivated extensive study of entanglement (see [3] for a review). Furthermore, the exciting subject of characterizing other interesting types of correlation has emerged. For example, quantum correlation, classical correlation, and combined quantum and classical correlation have been studied [4, 5, 6, 7].

The classical mutual information of a quantum state $\rho_{AB}$ can be defined naturally [8] as the maximum classical mutual information that can be obtained by local measurements $M_A \otimes M_B$ on the state $\rho_{AB}$:

$$I_c(\rho) = \max_{M_A \otimes M_B} I(A:B). \tag{1}$$

Here $I(A:B)$ is the classical mutual information defined as $I(A:B) = H(p_A) + H(p_B) - H(p_{AB})$, $H$ is the entropy function [9], and $p_{AB}, p_A, p_B$ are the probability distributions of the joint and individual outcomes of performing the local measurement $M_A \otimes M_B$ on $\rho$. The physical relevance of $I_c$ is manifold. First, $I_c(\rho)$ is the maximum classical correlation obtainable from $\rho$ by purely local processing. Second, $I_c(\rho)$ corresponds to the classical definition when $\rho$ is "classical," i.e., diagonal in some local product basis and corresponding to a classical distribution. Third, when $\rho$ is pure, $I_c(\rho)$ is the correlation defined by the Schmidt basis and thus equal to the entanglement of the pure state [10, 11]. Finally, $I_c(\rho) = 0$ if and only if $\rho = \rho_A \otimes \rho_B$ [12].
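
As a small illustration of the definition $I(A:B) = H(p_A) + H(p_B) - H(p_{AB})$ used above, the following sketch computes the classical mutual information of a joint outcome distribution. The function names and the two example distributions are illustrative assumptions, not taken from the paper, and the sketch does not perform the maximization over measurements in Eq. (1).

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(A:B) = H(p_A) + H(p_B) - H(p_AB) for a joint distribution.

    `joint[a][b]` is the probability that Alice sees outcome a and Bob sees b.
    """
    p_a = [sum(row) for row in joint]        # Alice's marginal
    p_b = [sum(col) for col in zip(*joint)]  # Bob's marginal
    p_ab = [p for row in joint for p in row]
    return entropy(p_a) + entropy(p_b) - entropy(p_ab)

# Hypothetical examples: one perfectly correlated bit, and two independent bits.
perfectly_correlated = [[0.5, 0.0], [0.0, 0.5]]
independent = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(perfectly_correlated))
print(mutual_information(independent))
```

The first distribution carries one bit of correlation and the second carries none, matching the statement that $I_c$ vanishes exactly for product states.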

Any good correlation measure should satisfy certain axiomatic properties. First, correlation is a nonlocal property and should not increase under local processing (monotonicity) (I). Second, a protocol starting from an uncorrelated initial state and using $l$ qubits or $2l$ classical bits of communication (one-way or two-way) and local operations should not create more than $2l$ bits of correlation. We call this property total proportionality (II). The intuition is that if $2l$ bits of correlation could be established with fewer than $2l$ bits of communication, then it might be possible to establish nonzero correlation with no communication if the receiver guesses the message.

We may expect other properties for any correlation measure. If a protocol has several rounds of communication, one may consider the increase of correlation due to each round of communication. Intuitively, a small amount of communication should not increase correlation abruptly. In particular, one may expect that the transmission of $l$ qubits or $2l$ bits should not increase the correlation of any initial state by more than $2l$ bits. We call this property incremental proportionality (III). This strengthens total proportionality by allowing all possible initial states, or equivalently by considering the increase in correlation step-wise. Other properties such as continuity in $\rho$ are also expected (IV).

All of these properties (I-IV) hold for some well-known correlation measures. They hold for the classical mutual information $I(A:B)$ when communication is classical [9], as one may expect. They also hold for the quantum mutual information $I_q(\rho)$ [8] (for any communication). Here $I_q(\rho) = S(\rho_A) + S(\rho_B) - S(\rho)$, with $S(\rho) = -\mathrm{Tr}\,\rho\log\rho$ being the von Neumann entropy and $\rho_A = \mathrm{Tr}_B\,\rho$, $\rho_B = \mathrm{Tr}_A\,\rho$. In Ref. [8], monotonicity, total proportionality, and continuity have been proved for $I_c$, while incremental proportionality was only proved for pure initial states $\rho$ (for any communication) and for the classical restriction.

In this paper, we report the surprising fact that incremental proportionality for $I_c$ can be violated in an extreme manner for a mixed initial state $\rho$. We will see that a single classical bit, sent from Alice to Bob, can result in an arbitrarily large increase in $I_c$. This phenomenon can be viewed as a way of locking classical correlation in the quantum state $\rho$. If one bit of communication increases $I_c$ by a large amount, the correlation must be "present" initially, though hidden or locked, as indicated by a small initial value of $I_c$. Only after the one-bit transmission can the large amount of correlation become accessible or unlocked. Since incremental proportionality of $I_c$ holds classically, the phenomenon of locked correlation is a purely quantum effect. It is a direct consequence of the indistinguishability of non-orthogonal quantum states. Applications of such indistinguishability are well known, most notably in quantum key distribution [13, 14] and the various partial quantum bit commitment and coin tossing protocols (see [15] and references therein). Curiously, the simple effect that we observe and bound in this paper had not been noted before.

For a given initial state $\rho$ and a given amount and type of communication, we can capture the increase in correlation by defining the following functions:

$$I_c^{(l)}(\rho) = \max_{\Lambda^{(l)}} I_c\big(\Lambda^{(l)}(\rho)\big), \qquad I_c^{[l]}(\rho) = \max_{\Lambda^{[l]}} I_c\big(\Lambda^{[l]}(\rho)\big). \tag{2}$$

The operator $\Lambda$ denotes a bipartite quantum operation consisting of local operations and no more than $l$ bits or qubits of communication, a constraint denoted by the superscript $(l)$ or $[l]$, respectively. Note that $I_c(\rho) = I_c^{(0)}(\rho) = I_c^{[0]}(\rho)$. Throughout the paper, we use $\rho$ and $\rho'$ to denote the states before and after the quantum operation with communication, $\rho' = \Lambda(\rho)$.

With this notation, we summarize our main results:

• We present an example in which 1 bit of classical communication increases $I_c$ by $\tfrac12\log d$ bits, where $\rho$ consists of $1 + \log d$ and $\log d$ qubits in Alice's and Bob's systems, respectively. Since $I_c$ satisfies total proportionality, the classical correlation can be viewed as being locked in the state $\rho$ and then unlocked in $\rho'$ by the 1-bit message.

• We bound the extent of incremental proportionality violation in terms of the amount of initial correlation and the amount of communication. The amount of correlation unlocked by $l$ bits of one-way classical communication can be bounded as (Theorem 1)

$$I_c^{(l)}(\rho) - I_c(\rho) \le l + (2^l - 1)\, I_c(\rho). \tag{3}$$

For small $I_c(\rho)$, the amount unlocked by $l$ qubits (two-way) can be bounded as (Theorem 2)

$$I_c^{[l]}(\rho) - I_c(\rho) \le 2l + O\!\left(d^2 \sqrt{I_c(\rho)}\, \log I_c(\rho)\right). \tag{4}$$

We now describe the example in which an arbitrary amount of correlation is unlocked with a one-bit message. The initial state $\rho$ is shared between subsystems held by Alice and Bob, with respective dimensions $2d$ and $d$,

$$\rho = \frac{1}{2d} \sum_{k=0}^{d-1} \sum_{t=0}^{1} \big(|k\rangle\langle k| \otimes |t\rangle\langle t|\big)_A \otimes \big(U_t |k\rangle\langle k| U_t^\dagger\big)_B. \tag{5}$$

Here $U_0 = I$ and $U_1$ changes the computational basis to a conjugate basis ($\forall\, i,k:\ |\langle i|U_1|k\rangle| = 1/\sqrt{d}$). In this example, Bob is given a random draw $|k\rangle$ from $d$ states in two possible random bases (depending on $t = 0$ or $1$), while Alice has complete knowledge of his state. To achieve $I_c^{(1)}(\rho) = \log d + 1$, Alice sends $t$ to Bob, who then undoes $U_t$ on his state and measures $k$ in the computational basis. Alice and Bob now share both $k$ and $t$, with $\log d + 1$ bits of correlation.

For example, the state $\rho$ can arise from the following scenario. Let $d = 2^n$. Alice picks a random $n$-bit string $k$ and sends Bob $|k\rangle$ or $H^{\otimes n}|k\rangle$ depending on whether the random bit $t = 0$ or $1$. Here $H$ is the Hadamard transform. Alice can send $t$ to Bob to unlock the correlation later. Experimentally, Hadamard transforms and measurements on single qubits are sufficient to prepare the state $\rho$ and later extract the unlocked correlation in $\rho'$; they can be realized using photons and linear optical elements like quarter-wave plates and calcite crystals.

Now we prove that $I_c(\rho) = \tfrac12 \log d$. First, the complete measurement $M_A$ along $\{|k\rangle \otimes |t\rangle\}$ is provably optimal for Alice: since the outcome tells her precisely which pure state from the ensemble she has, she can apply classical, local post-processing to obtain the output distribution for any other measurement she could have performed. For Alice's choice of optimal measurement, $I_c(\rho)$ is simply Bob's accessible information $I_{\rm acc}$ [?] about the uniform ensemble of states $\{|k\rangle, U_1|k\rangle\}_{k=0,\dots,d-1}$.

In general, the accessible information $I_{\rm acc}$ about an ensemble of states $\mathcal{E} = \{p_i > 0, \eta_i\}$ is the maximum mutual information between $i$ and the outcome of a measurement. $I_{\rm acc}(\mathcal{E})$ can be maximized by a POVM with rank-1 elements only. Let $M = \{\alpha_j |\phi_j\rangle\langle\phi_j|\}_j$ stand for a POVM with rank-1 elements where each $|\phi_j\rangle$ is normalized and $\alpha_j > 0$. Then $I_{\rm acc}(\mathcal{E})$ can be expressed as

$$I_{\rm acc}(\mathcal{E}) = \max_M \left[ -\sum_i p_i \log p_i + \sum_i \sum_j p_i\, \alpha_j \langle\phi_j|\eta_i|\phi_j\rangle \log \frac{p_i \langle\phi_j|\eta_i|\phi_j\rangle}{\langle\phi_j|\rho|\phi_j\rangle} \right], \tag{6}$$

where $\rho = \sum_i p_i \eta_i$ is the ensemble average.

We now apply Eq. (6) to the present problem. Our ensemble is $\{\tfrac{1}{2d}, U_t|k\rangle\}_{k,t}$ with $i = (k,t)$, $p_{k,t} = \tfrac{1}{2d}$, ensemble average $\rho = I/d$, and $\langle\phi_j|\rho|\phi_j\rangle = 1/d$. Putting all these in Eq. (6),

$$\begin{aligned} I_c(\rho) &= \max_M \left[ \log 2d + \sum_{j,k,t} \frac{\alpha_j}{2d}\, |\langle\phi_j|U_t|k\rangle|^2 \log \frac{|\langle\phi_j|U_t|k\rangle|^2}{2} \right] \\ &= \max_M \left[ \log d + \sum_j \frac{\alpha_j}{d} \cdot \frac12 \sum_{k,t} |\langle\phi_j|U_t|k\rangle|^2 \log |\langle\phi_j|U_t|k\rangle|^2 \right], \end{aligned}$$

where we use $\sum_j \alpha_j = d$ and $\forall\, j,t:\ \sum_k |\langle\phi_j|U_t|k\rangle|^2 = 1$ to obtain the last line. Since $\sum_j \alpha_j / d = 1$, the second term is a convex combination, and can be upper bounded by maximization over just one term:

$$I_c(\rho) \le \log d + \max_{|\phi\rangle} \frac12 \sum_{k,t} |\langle\phi|U_t|k\rangle|^2 \log |\langle\phi|U_t|k\rangle|^2. \tag{7}$$

Note that $-\sum_{k,t} |\langle\phi|U_t|k\rangle|^2 \log |\langle\phi|U_t|k\rangle|^2$ is the sum of the entropies of measuring $|\phi\rangle$ in the computational basis and the conjugate basis. The entropic uncertainty relation of Maassen and Uffink shows that such a sum of entropies is at least $\log d$. Lower bounds of this type are called entropic uncertainty inequalities, which quantify how much a vector $|\phi\rangle$ cannot be simultaneously aligned with states from two conjugate bases. It follows that $I_c(\rho) \le \tfrac12 \log d$. Equality can in fact be attained when Bob measures in the computational basis, so that $I_c(\rho) = \tfrac12 \log d$ and $I_c^{(1)}(\rho) - I_c(\rho) = 1 + \tfrac12 \log d$.
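
The two values $\tfrac12\log d$ and $\log d + 1$ can be checked numerically for small systems. The sketch below assumes $d = 2^n$ with $U_1 = H^{\otimes n}$ (the scenario with Hadamard transforms described earlier) and evaluates the mutual information for two specific strategies: Bob measuring in the computational basis without knowing $t$, and Bob learning $t$ and undoing $U_t$ first. It does not carry out the maximization over all measurements, so it illustrates the attained values rather than proving optimality. All function names are illustrative.

```python
from math import log2, sqrt

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(A:B) for a dict mapping (alice_outcome, bob_outcome) -> probability."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return entropy(p_a.values()) + entropy(p_b.values()) - entropy(joint.values())

def overlap_sq(j, k, t, n):
    """|<j|U_t|k>|^2 with U_0 = I and U_1 = H^{(x)n} on n qubits."""
    if t == 0:
        return 1.0 if j == k else 0.0
    amplitude = (-1) ** bin(j & k).count("1") / sqrt(2 ** n)
    return amplitude ** 2

def locked_information(n):
    """Alice reads (k, t); Bob measures in the computational basis, ignorant of t."""
    d = 2 ** n
    joint = {}
    for k in range(d):
        for t in (0, 1):
            for j in range(d):
                joint[((k, t), j)] = overlap_sq(j, k, t, n) / (2 * d)
    return mutual_information(joint)

def unlocked_information(n):
    """After receiving t, Bob undoes U_t and recovers k, so both hold (k, t)."""
    d = 2 ** n
    joint = {((k, t), (k, t)): 1.0 / (2 * d) for k in range(d) for t in (0, 1)}
    return mutual_information(joint)

for n in (1, 2, 4):
    print(n, locked_information(n), unlocked_information(n))
```

For each $n$ the first value equals $n/2$ and the second equals $n + 1$ up to floating-point rounding, so the one-bit message raises the correlation by $1 + n/2$ bits.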

We remark that incremental proportionality remains violated for multiple copies of $\rho$. Wootters proved [18] that the accessible information from $m$ independent draws of an ensemble $\mathcal{E}$ of separable states is additive, $I_{\rm acc}(\mathcal{E}^{\otimes m}) = m\, I_{\rm acc}(\mathcal{E})$. It follows that $I_c(\rho^{\otimes m}) = m\, I_c(\rho)$ in our example.

One would expect a stronger locking effect when the message (a key) is longer than one bit. There are two figures of merit. First, the "amplification" of correlation, $r_1 = I_c(\rho')/I_c(\rho)$, should be large. Second, the amount of unlocked information compared to the key size, $r_2 = (I_c(\rho') - I_c(\rho))/l$, should be large. Ideally, we want both $r_1$ and $r_2$ to be arbitrarily large. We have investigated this possibility (see the Appendix for details) by generalizing our 2-bases example to $L > 2$ conjugate (or mutually unbiased) bases. The key size is then $l = \log L$. We have found rigorous results for the two extreme cases, namely the previous example with $L = 2$, in which $(r_1, r_2) \approx (2, \tfrac12\log d)$, and the case of $L = d+1$ bases, in which $(r_1, r_2) \approx (2\log d, 2)$. We believe some intermediate values of $L$ will make both $r_1$ and $r_2$ large. For example, any $\log L = o(\log d)$ will guarantee that $r_1$ is large. But an analytic proof that $r_2$ is also large has proved to be difficult, and numerical studies are inconclusive (see Appendix).
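
To make the two figures of merit concrete, the sketch below evaluates $r_1$ and $r_2$ in the two extreme cases. For $L = 2$ it uses $I_c(\rho) = \tfrac12\log d$ and $I_c(\rho') = \log d + 1$ with $l = 1$. For $L = d+1$ it uses $I_c(\rho') = \log(d+1) + \log d$ together with the Appendix upper bound $I_c(\rho) \le 1 - \log(1 + 1/d)$, so the returned values are lower bounds on $r_1$ and $r_2$. The function names and the choice $d = 2^{20}$ are illustrative assumptions.

```python
from math import log2

def merits_two_bases(d):
    """(r1, r2) for the L = 2 example, where the key is a single bit."""
    ic_initial = 0.5 * log2(d)
    ic_final = log2(d) + 1
    key_bits = 1
    return ic_final / ic_initial, (ic_final - ic_initial) / key_bits

def merits_all_bases_lower_bounds(d):
    """Lower bounds on (r1, r2) for L = d + 1 bases (d a prime power)."""
    key_bits = log2(d + 1)
    ic_initial_upper = 1 - log2(1 + 1 / d)  # upper bound on I_c(rho)
    ic_final = key_bits + log2(d)
    r1 = ic_final / ic_initial_upper
    r2 = (ic_final - ic_initial_upper) / key_bits
    return r1, r2

d = 2 ** 20
print(merits_two_bases(d))
print(merits_all_bases_lower_bounds(d))
```

With two bases the amplification stays near 2 while the information unlocked per key bit grows like $\tfrac12\log d$; with $d+1$ bases the amplification grows like $2\log d$ while the information unlocked per key bit stays near 2.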

An even stronger kind of locking would be what we call complete locking, in which $I_c(\rho)$ would decrease rapidly with the key size $l$, yet the key can retrieve a finite fraction of the data. For example,

$$I_c(\rho) \propto 2^{-\alpha l} \quad \text{and} \quad I_c(\rho') - l \approx \delta \log d, \tag{8}$$

where $\rho$ is supported on two $d$-dimensional systems, $\delta > 0$ is independent of $d$ and $l$, and $\alpha > 0$. Note that $r_1, r_2$ are automatically large for large $d$ in complete locking. We find that for large $d$ complete locking cannot occur with $\alpha > 1$ or for very short keys $l = o(\log\log d)$. This follows from the following Theorem:

Theorem 1 If $\rho'$ is obtained from $\rho$ with $l$ bits of one-way classical communication, then $I_c(\rho) \ge 2^{-l}\,(I_c(\rho') - l)$. It follows that $I_c^{(l)}(\rho) - I_c(\rho) \le l + (2^l - 1)\, I_c(\rho)$.

The intuition behind the proof is that Bob can just guess the classical key. If he guesses correctly (with probability $2^{-l}$), he gains $I_c(\rho')$ bits of information, so that the average information gain is at least $2^{-l} I_c(\rho')$.

Proof: Let $\rho'$ result from sending an $l$-bit message (or key) from Alice to Bob. Let the random variable $Z$ describe the key, and let the random variable $X$ be the outcome of Alice's POVM measurement that optimizes $I_c(\rho')$. We can always include $Z$ as part of $X$. Bob applies one of $2^l$ possible measurements based on a random variable $\tilde{Z}$, yielding the outcome $Y$. To achieve $I_c(\rho')$, Bob takes $\tilde{Z} = Z$, and each of his measurements is optimal for each value of $Z$. Therefore [9]:

$$I_c(\rho') = I(X : Y Z \tilde{Z} \,|\, \tilde{Z} = Z) = I(X : Y Z \,|\, \tilde{Z} = Z). \tag{9}$$

Applying the chain rule [9]:

$$I(X : YZ \,|\, \tilde{Z} = Z) = I(X : Y \,|\, Z, \tilde{Z} = Z) + I(X : Z \,|\, \tilde{Z} = Z) \le I(X : Y \,|\, Z, \tilde{Z} = Z) + l, \tag{10}$$

where we have used $I(X : Z \,|\, \tilde{Z} = Z) \le l$ because $l$ is the size of the key $Z$.

Working from the other end, consider the following, not necessarily optimal, measurement on $\rho$: Alice's measurement is the same as before, but $Z$ is not sent to Bob. Instead, Bob draws $\tilde{Z}$ uniformly at random. The resulting mutual information provides a lower bound on $I_c(\rho)$: $I_c(\rho) \ge I(X : Y\tilde{Z})$. By the chain rule, we can write $I(X : Y\tilde{Z}) = I(X : Y | \tilde{Z}) + I(X : \tilde{Z}) = I(X : Y | \tilde{Z})$, where the last equality holds because $\tilde{Z}$ is independent of $X$. Hence

$$I_c(\rho) \ge I(X : Y | \tilde{Z}). \tag{11}$$

Because $Z$ is part of $X$, we can write

$$I(X : Y | \tilde{Z}) = I(XZ : Y | \tilde{Z}) = I(X : Y | Z\tilde{Z}) + I(Z : Y | \tilde{Z}) \ge I(X : Y | Z\tilde{Z}), \tag{12}$$

again using the chain rule and $I(Z : Y | \tilde{Z}) \ge 0$. Now, comparing (10) and (12),

$$I(X : Y \,|\, Z, \tilde{Z} = Z) = \sum_{z_0} \Pr(Z = z_0)\, I(X : Y \,|\, Z = \tilde{Z} = z_0), \tag{13}$$

$$I(X : Y \,|\, Z\tilde{Z}) = \sum_{z_0, z_1} \Pr(Z = z_0)\, 2^{-l}\, I(X : Y \,|\, Z = z_0, \tilde{Z} = z_1). \tag{14}$$

The sum (14) is the same sum as (13) but with some extra (nonnegative) terms and a factor of $1/2^l$, so

$$I(X : Y \,|\, Z\tilde{Z}) \ge 2^{-l}\, I(X : Y \,|\, Z, \tilde{Z} = Z), \tag{15}$$

and putting together (9)-(12) and (15) proves the first statement. The second statement is true because there is only one round of communication; monotonicity then implies that the optimal $\Lambda^{(l)}$ in (2) consists of just the communication. □
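
As a quick check of how tight Theorem 1 is, the sketch below evaluates both statements of the theorem on the two-bases example with $d = 2^n$, using the values $I_c(\rho) = n/2$ and $I_c(\rho') = n + 1$ derived earlier and a one-bit key. The function names and the choice $n = 10$ are illustrative.

```python
def theorem1_max_increase(ic_initial, key_bits):
    """Right-hand side of I_c^(l)(rho) - I_c(rho) <= l + (2^l - 1) I_c(rho)."""
    return key_bits + (2 ** key_bits - 1) * ic_initial

def theorem1_min_initial(ic_final, key_bits):
    """Right-hand side of I_c(rho) >= 2^(-l) (I_c(rho') - l)."""
    return (ic_final - key_bits) / 2 ** key_bits

n = 10                  # Bob holds n qubits, so d = 2^n
ic_initial = n / 2      # locked correlation, (1/2) log d
ic_final = n + 1        # unlocked correlation, log d + 1
key_bits = 1

increase = ic_final - ic_initial
print(increase, theorem1_max_increase(ic_initial, key_bits))
print(ic_initial, theorem1_min_initial(ic_final, key_bits))
```

Both inequalities hold with equality here, so for a one-bit key the two-bases example saturates the bound of Theorem 1.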

We can bound the violation of incremental proportionality in yet another way. Total proportionality for $I_c$ (when $I_c(\rho) = 0$, transmitting $l$ qubits can increase $I_c$ by at most $2l$ bits) can be restated as: "$I_c(\rho) = 0$" implies no incremental proportionality violation. We may thus expect a small violation of incremental proportionality when $I_c(\rho)$ is small. We are able to prove the following:

Theorem 2 Let $\rho$ be a bipartite state on $C^d \otimes C^d$ and $\rho'$ be obtained from $\rho$ by $l$ qubits of two-way communication. If $I_c(\rho) < \frac{1}{2e^2 \ln 2\, (2d)^4}$, then

$$I_c(\rho') - I_c(\rho) \le 2l - (2d)^2 \sqrt{(2\ln 2)\, I_c(\rho)}\, \log\sqrt{(2\ln 2)\, I_c(\rho)}.$$

The proof of Theorem 2 relies essentially on the following lemma (see the Appendix for a proof), which says that when $I_c(\rho)$ is small, $\rho$ must be close to an uncorrelated state (in trace distance).

Lemma 1 If $\rho$ is a bipartite state on $C^d \otimes C^d$, then

$$\mathrm{Tr}\,|\rho_{AB} - \rho_A \otimes \rho_B| \le (2d)^2 \sqrt{2\ln 2\; I_c(\rho)}, \tag{16}$$

where $\rho_{A/B} = \mathrm{Tr}_{B/A}\, \rho$.

The theorem can be proved by first relating $I_c$ to $I_q$, which obeys incremental proportionality (with an extra factor of 2). Then Lemma 1 and the continuity of $I_q$ imply that $I_q(\rho)$ is close to $I_q(\rho_A \otimes \rho_B)$, giving the desired bound (see the Appendix for details).

The weakness of Lemma 1, and thus that of Theorem 2, stems from the factor $d^2$ in Lemma 1. This factor comes from an analysis that uses measurements in all mutually unbiased bases to distinguish $\rho_A \otimes \rho_B$ from $\rho$, and the analysis is probably not optimal. Note that the dependence on the dimension $d$ in the bound in Theorem 2 makes it impossible to completely rule out complete locking.

Our locking scheme is closely related to quantum key distribution (QKD), in particular BB84 [14], in which Alice holds a basis bit (computational or Hadamard) for each of Bob's qubits. Transmitting the locked state limits the classical correlation between Alice and any potential eavesdropper (Eve) and forbids Eve from tampering without disturbance. Announcing the basis bits at a later stage enables Alice and Bob to unlock the correlation. Furthermore, incompletely unlocked correlation (as indicated by the test bits) reveals Eve's tampering. However, in BB84, one bit is sent for every bit to be unlocked, and there is no extreme unlocking behavior as shown by our examples.

Further research into the phenomenon of locking will be worthwhile. For instance, we have seen differences in the locking effect by quantum and classical keys. Another important factor affecting the strength of locking is the number of rounds of communication allowed. In fact, a striking difference between one-way and two-way communication can be seen if one generalizes the state in Eq. (5) so that each of Alice and Bob has a one-bit key register, and the rotation $U_t$, now performed on both Bob's and Alice's states, is determined by the parity of the two key bits. Full unlocking is possible with two-way communication, but not with one-way communication. Finally, the possibility of complete locking, or its impossibility (by improving Lemma 1 and Theorem 2), is an important open question; it may be interesting to see how complete locking relates to known restrictions on partial bit commitments [19].

APPENDIX
Locking with more bases 

Intuitively, we expect a larger key to exert a stronger locking effect (i.e., give a larger value of $I_c(\rho') - I_c(\rho)$). For instance, we have seen how classical mutual information can be locked by encoding in one of two bases. A natural question is, can we lock more information by encoding in $L > 2$ bases? A convenient choice of such bases are the mutually conjugate or mutually unbiased bases, with the defining property that the inner product between any two states from two different bases has magnitude $1/\sqrt{d}$ in a $d$-dimensional system. It is known that one can have at most $d + 1$ mutually conjugate bases in $d$ dimensions, and this maximum number of bases exists and can be constructed when $d$ is a prime power [20, 21]. Let $U_1, \dots, U_d$ take the computational basis to each of these conjugate bases and let $U_0 = I$. In a scheme using $L$ bases (with key size $l = \log L$),

$$\rho = \frac{1}{Ld} \sum_{k=1}^{d} \sum_{t=0}^{L-1} \big(|k\rangle\langle k| \otimes |t\rangle\langle t|\big)_A \otimes \big(U_t |k\rangle\langle k| U_t^\dagger\big)_B. \tag{17}$$

When Alice tells Bob which basis $t$ is used, the resulting state $\rho'$ again has $I_c(\rho') = \log L + \log d = l + \log d$. Applying the same analysis as before,

$$I_c(\rho) \le \log d + \max_{|\phi\rangle} \frac{1}{L} \sum_{k,t} |\langle\phi|U_t|k\rangle|^2 \log |\langle\phi|U_t|k\rangle|^2. \tag{18}$$

When $L = 2$, $I_c(\rho) = \tfrac12 \log d$. Thus one would hope that $I_c(\rho) = \tfrac{1}{L} \log d$ in general. Unfortunately, the crucial entropic inequality used above for $L = 2$ does not provide the desired bound. Extensive numerical work on primes $3 \le d \le 29$ and $2 \le L \le d + 1$ shows that $I_c(\rho) \approx (\tfrac{1}{L} + c) \log d$, where $c$ is roughly 0.1 to 0.15 for the values of $d$ investigated.

In the extreme case of $L = d + 1$, we can apply another entropic inequality [22], namely that the sum of the entropies is at least $(d + 1) \log\big(\tfrac{d+1}{2}\big)$, so that

$$I_c(\rho) \le \log d - \log(d + 1) + 1 = 1 - \log\big(1 + \tfrac{1}{d}\big) \quad \text{and} \quad I_c(\rho') - I_c(\rho) \ge 2\log(d + 1) - 1.$$

This still unlocks on the order of $\log d$ bits, though the amount is comparable to the $\log(d + 1)$ bits communicated, and thus we have no (strong) violation of incremental proportionality in this regime.
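
The two entropic inequalities used in this Appendix can be spot-checked in the smallest case, $d = 2$, where the three mutually unbiased bases can be taken as the eigenbases of the Pauli operators $Z$, $X$, and $Y$. The sketch below samples random single-qubit pure states and records the smallest observed entropy sums. It is a numerical illustration under these assumptions (qubit case only, random sampling with a fixed seed), not a proof; the function names are illustrative.

```python
import random
from math import log2, sqrt

def entropy(probs):
    """Shannon entropy in bits, ignoring numerically zero probabilities."""
    return -sum(p * log2(p) for p in probs if p > 1e-15)

def random_qubit_state(rng):
    """A random normalized vector (a, b) in C^2."""
    a = complex(rng.gauss(0, 1), rng.gauss(0, 1))
    b = complex(rng.gauss(0, 1), rng.gauss(0, 1))
    norm = sqrt(abs(a) ** 2 + abs(b) ** 2)
    return a / norm, b / norm

def basis_entropies(a, b):
    """Outcome entropies of a|0> + b|1> in the Z, X and Y eigenbases."""
    z = [abs(a) ** 2, abs(b) ** 2]
    x = [abs(a + b) ** 2 / 2, abs(a - b) ** 2 / 2]
    y = [abs(a - 1j * b) ** 2 / 2, abs(a + 1j * b) ** 2 / 2]
    return entropy(z), entropy(x), entropy(y)

rng = random.Random(0)
min_two = min_three = float("inf")
for _ in range(20000):
    hz, hx, hy = basis_entropies(*random_qubit_state(rng))
    min_two = min(min_two, hz + hx)           # two conjugate bases
    min_three = min(min_three, hz + hx + hy)  # all d + 1 = 3 bases

d = 2
print(min_two, ">=", log2(d))
print(min_three, ">=", (d + 1) * log2((d + 1) / 2))
```

Every sampled state respects both lower bounds: the two-basis sum stays above $\log d = 1$, and the three-basis sum stays above $(d+1)\log\frac{d+1}{2} \approx 1.585$.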

Small initial correlation 

In Theorem 1, the difference between $I_c(\rho)$ and $I_c(\rho')$ is bounded by the product of the initial correlation and the number of different messages that can be sent. Here, we bound the violation by a function of the initial correlation only, allowing an arbitrary number of qubits communicated interactively. More formally,

Theorem 2 Let $\rho$ be a bipartite state on $C^d \otimes C^d$ and $\rho'$ be obtained from $\rho$ by $l$ qubits of two-way communication. Let $d' < 2d$ be the least prime power no less than $d$, and $\eta(x) = -x\log x$. If $I_c(\rho) < \frac{1}{2e^2 \ln 2\, (d'+1)^4}$, then

$$I_c(\rho') - I_c(\rho) \le 2l + 2(d'+1)^2 \sqrt{2 I_c(\rho) \ln 2}\, \log d + \eta\big((d'+1)^2 \sqrt{2 I_c(\rho) \ln 2}\big).$$

A simpler, but less tight, expression can be obtained from the above by expanding the log function in $\eta$:

$$I_c(\rho') - I_c(\rho) \le 2l - (d'+1)^2 \sqrt{(2\ln 2)\, I_c(\rho)}\, \log\sqrt{(2\ln 2)\, I_c(\rho)}.$$

Thus, even for the most general communication model, incremental proportionality violation is continuous in the initial correlation, with incremental proportionality holding when the initial state is uncorrelated (a special case of pure initial states). However, the present bound is not uniform with respect to the size of the support of $\rho$: to get $I_c(\rho') \le 2l + \delta$, we need approximately $I_c(\rho) \le \delta^2 (2d)^{-4}$.
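
To get a feel for how restrictive the dimension dependence is, the sketch below evaluates a bound of the form $2l + 2T\log d + \eta(T)$ with $T = (d'+1)^2\sqrt{2 I_c(\rho)\ln 2}$. Two points are assumptions of this sketch rather than statements recovered from the text: the squared prefactor $(d'+1)^2$ is taken to match the $d^2$ scaling attributed to Lemma 1, and the applicability condition is taken to be $T \le 1/e$, the range in which Fannes' inequality is used in the proof. All function names are illustrative.

```python
from math import e, log, log2, sqrt

def is_prime_power(m):
    """True if m = p^n for some prime p and integer n >= 1."""
    if m < 2:
        return False
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            return m == 1
        p += 1
    return True  # no divisor up to sqrt(m), so m is prime

def least_prime_power(d):
    """Smallest prime power d' >= d."""
    m = max(d, 2)
    while not is_prime_power(m):
        m += 1
    return m

def eta(x):
    """eta(x) = -x log2(x), extended by eta(0) = 0."""
    return -x * log2(x) if x > 0 else 0.0

def theorem2_bound(ic_initial, qubits, d):
    """Assumed form of the Theorem 2 bound on I_c(rho') - I_c(rho)."""
    d_prime = least_prime_power(d)
    t = (d_prime + 1) ** 2 * sqrt(2 * ic_initial * log(2))
    if t > 1 / e:
        raise ValueError("initial correlation too large for the bound to apply")
    return 2 * qubits + 2 * t * log2(d) + eta(t)

for ic in (1e-8, 1e-10, 1e-12):
    print(ic, theorem2_bound(ic, qubits=1, d=4))
```

Even for $d = 4$ the initial correlation has to be tiny before the excess over $2l$ becomes small, which illustrates why the bound cannot rule out complete locking.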

Proof: The theorem can be proved by putting together various properties of $I_c(\rho)$ and $I_q(\rho)$, and the main steps of the proof can be summarized as:

$$\begin{aligned} I_c(\rho') - I_c(\rho) \le I_c(\rho') &\overset{1}{\le} I_q(\rho') \overset{2}{\le} 2l + I_q(\rho) \\ &\overset{3}{\le} 2l + \log d^2\; \mathrm{Tr}\,|\rho_A \otimes \rho_B - \rho_{AB}| + \eta\big(\mathrm{Tr}\,|\rho_A \otimes \rho_B - \rho_{AB}|\big) \\ &\overset{4}{\le} 2l + 2(\log d)(d'+1)^2 \sqrt{2 I_c(\rho) \ln 2} + \eta\big((d'+1)^2 \sqrt{2 I_c(\rho) \ln 2}\big). \end{aligned}$$

First, we explain the intuition behind the properties that make each step valid; then, we complete the proof by proving each of the steps. The idea is to upper bound $I_c$ by $I_q$ and use incremental proportionality of $I_q$ [8] in steps 1 and 2. Then it remains to show that $I_q$ is small if $I_c$ is small, and this is done in steps 3 and 4. Step 3 expresses $I_q$ as a difference of the entropies $S(\rho)$ and $S(\rho_A \otimes \rho_B)$, which is subsequently bounded by Fannes' inequality [23]. Step 4 is to prove and apply the following lemma:

Lemma 1 If $\rho$ is a bipartite state on $C^d \otimes C^d$, then

$$\mathrm{Tr}\,|\rho_{AB} - \rho_A \otimes \rho_B| \le (d'+1)^2 \sqrt{2\ln 2\; I_c(\rho)}, \tag{19}$$

where $d' < 2d$ is the least prime power no less than $d$.

This lemma says that a state with small classical mutual information is close to being a product state, and a simple consequence is that $I_c(\rho) = 0$ iff $\rho$ is a product state. Steps 3 and 4 give the desired bound on $I_q$ in terms of $I_c$: if $I_c(\rho) < \frac{1}{2e^2 \ln 2\, (d'+1)^4}$, then

$$I_q(\rho) \le 2(d'+1)^2 \sqrt{2 I_c(\rho) \ln 2}\, \log d + \eta\big((d'+1)^2 \sqrt{2 I_c(\rho) \ln 2}\big).$$

We proceed to prove steps 1, 3, and 4. First, $I_q$ and $I_c$ can be rewritten as [24]:

$$I_q(\rho) = S(\rho_{AB} \,\|\, \rho_A \otimes \rho_B), \tag{20}$$

$$I_c(\rho) = \max_{M_A \otimes M_B} S(p_{AB} \,\|\, p_A \otimes p_B), \tag{21}$$

where $p_{AB}, p_A, p_B$ are the probability distributions of the joint and individual outcomes of applying a local measurement $M_A \otimes M_B$ to $\rho$, and the quantum relative entropy is defined as

$$S(\nu \,\|\, \mu) := \mathrm{Tr}(\nu \log \nu) - \mathrm{Tr}(\nu \log \mu). \tag{22}$$

To prove step 1, let $M_A$ and $M_B$ be the optimal measurements for $I_c(\rho)$. Let $\Lambda$ be the local quantum operation of applying $M_A \otimes M_B$, followed by storing the classical outcomes in ancillas $A'$ and $B'$ and discarding the original systems $A$ and $B$. The final state $\rho_{A'B'} = \Lambda(\rho)$ is a classical state corresponding to $p_{AB}$, so that

$$\begin{aligned} I_c(\rho) &= S(p_{AB} \,\|\, p_A \otimes p_B) \\ &= S(\rho_{A'B'} \,\|\, \rho_{A'} \otimes \rho_{B'}) \\ &\le S(\rho_{AB} \,\|\, \rho_A \otimes \rho_B) = I_q(\rho), \end{aligned} \tag{23}$$

where the inequality in Eq. (23) is due to the monotonicity of $I_q(\rho)$ under the local operation $\Lambda$.

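To prove step 3, recall Fannes' inequality for $d_0$-dimensional states $\nu, \mu$ with $\mathrm{Tr}\,|\nu - \mu| \le 1/e$:

$$|S(\nu) - S(\mu)| \le \log d_0\; \mathrm{Tr}\,|\nu - \mu| + \eta\big(\mathrm{Tr}\,|\nu - \mu|\big).$$

Hence, if $\mathrm{Tr}\,|\rho_A \otimes \rho_B - \rho_{AB}| \le 1/e$,

$$I_q(\rho) = S(\rho_A \otimes \rho_B) - S(\rho_{AB}) \le 2\log d\; \mathrm{Tr}\,|\rho_A \otimes \rho_B - \rho_{AB}| + \eta\big(\mathrm{Tr}\,|\rho_A \otimes \rho_B - \rho_{AB}|\big). \tag{24}$$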
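
Fannes' inequality can be illustrated in the special case of commuting (diagonal) states, where the von Neumann entropy reduces to the Shannon entropy and $\mathrm{Tr}\,|\nu - \mu|$ reduces to the $\ell_1$ distance between probability vectors. The sketch below checks the inequality on randomly generated pairs of distributions; restricting to diagonal states, the dimension 8, the mixing weight 0.1, and all names are assumptions of this illustration.

```python
import random
from math import e, log2

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def eta(x):
    """eta(x) = -x log2(x), extended by eta(0) = 0."""
    return -x * log2(x) if x > 0 else 0.0

def random_distribution(rng, size):
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def fannes_gap(p, q):
    """Right-hand side minus left-hand side of Fannes' inequality."""
    t = sum(abs(a - b) for a, b in zip(p, q))  # l1 distance, plays Tr|nu - mu|
    assert t <= 1 / e, "Fannes' inequality is only used for t <= 1/e"
    return log2(len(p)) * t + eta(t) - abs(entropy(p) - entropy(q))

rng = random.Random(1)
gaps = []
for _ in range(2000):
    p = random_distribution(rng, 8)
    r = random_distribution(rng, 8)
    # Mixing p with 10% of r keeps the l1 distance below 0.2 < 1/e.
    q = [0.9 * a + 0.1 * b for a, b in zip(p, r)]
    gaps.append(fannes_gap(p, q))

print(min(gaps))
```

The smallest gap is nonnegative, so the entropy difference never exceeds the Fannes bound on these samples.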

Once Lemma 1 is proved, step 4 can be obtained by substituting Eq. (19) into Eq. (24). We prove Lemma 1 for $\rho$ on $C^d \otimes C^d$ where $d = p^n$ is a prime power. The general case follows because, when $d$ is not a prime power, $\rho$ can still be taken as a state on $C^{d'} \otimes C^{d'}$, where $d' < 2d$ is the least prime power no less than $d$. The main idea in the proof is to rewrite $\mathrm{Tr}\,|\rho_{AB} - \rho_A \otimes \rho_B|$ as a sum, each term of which is bounded by the initial classical mutual information of $\rho$. This relies on the following result proved for $d = p^n$ [20, 21]. There exists a basis $\{M_i^k\}_{k=1,\dots,d+1;\ i=1,\dots,d-1}$ for traceless $d \times d$ matrices, such that $\mathrm{Tr}\, M_i^{k\dagger} M_j^l = d\, \delta_{kl} \delta_{ij}$, i.e., orthonormal under the trace inner product up to a scaling factor. Furthermore, for each $k$, $\{M_i^k\}_{i=1,\dots,d-1}$ is a commuting set, and can be simultaneously diagonalized by conjugation by some $U_k$.

Using $\{M_i^k\}_{i,k} \cup \{I\}$ as a basis for $d \times d$ matrices, we can express $\rho_{AB}$ as

$$\rho_{AB} = \frac{1}{d^2} \Big[ I \otimes I + \sum_{k,i} a_{ki}\, M_i^k \otimes I + \sum_{l,j} a'_{lj}\, I \otimes M_j^l + \sum_{k,i,l,j} a_{kilj}\, M_i^k \otimes M_j^l \Big] \tag{25}$$

with generally complex coefficients $a_{kilj}$. Using the commutativity of each $\{M_i^k\}_i$,

$$\rho_{AB} = \frac{1}{d^2} \Big[ I \otimes I + \sum_k (U_k E_k U_k^\dagger) \otimes I + \sum_l I \otimes (U_l F_l U_l^\dagger) + \sum_{k,l} (U_k \otimes U_l)\, D_{kl}\, (U_k \otimes U_l)^\dagger \Big] \tag{26}$$

for some diagonal matrices $E_k$, $F_l$ and $D_{kl}$, where each $E_k$, $F_l$ is $d \times d$, and each $D_{kl}$ is $d^2 \times d^2$. Then,

$$\rho_A = \frac{1}{d} \Big[ I + \sum_k U_k E_k U_k^\dagger \Big], \qquad \rho_B = \frac{1}{d} \Big[ I + \sum_l U_l F_l U_l^\dagger \Big],$$

and

$$\rho_{AB} - \rho_A \otimes \rho_B = \frac{1}{d^2} \sum_{k,l} (U_k \otimes U_l)(D_{kl} - E_k \otimes F_l)(U_k \otimes U_l)^\dagger,$$

so that

$$\mathrm{Tr}\,|\rho_{AB} - \rho_A \otimes \rho_B| \le \frac{1}{d^2} \sum_{k,l} \mathrm{Tr}\,|D_{kl} - E_k \otimes F_l| = \sum_{k,l} \mathrm{Tr}\,|p_{kl} - q_{kl}| \le \sum_{k,l} \sqrt{2\ln 2\; S(p_{kl} \,\|\, q_{kl})},$$

where the last inequality follows from $S(\nu \,\|\, \mu) \ge \frac{1}{2\ln 2} (\mathrm{Tr}\,|\nu - \mu|)^2$ [?, 23]. In the above, $p_{kl}$ and $q_{kl}$ are the probability distributions of the outcomes when locally measuring $\rho$ and $\rho_A \otimes \rho_B$ along the simultaneous tensor product eigenbasis of $\{M_i^k\}_i$ and $\{M_j^l\}_j$. In each of these measurements, only the $kl$ terms contribute due to the orthonormality of the basis chosen. According to Eq. (21), the relative entropy between $p_{kl}$ and $q_{kl}$ is a lower bound for $I_c$. Since the sum has $(d+1)^2$ terms,

$$\mathrm{Tr}\,|\rho_{AB} - \rho_A \otimes \rho_B| \le (d+1)^2 \sqrt{2\ln 2\; I_c(\rho)}. \tag{27}$$

This completes the proofs for all the steps and thus our theorem.
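
The inequality $S(\nu\|\mu) \ge \frac{1}{2\ln 2}(\mathrm{Tr}\,|\nu-\mu|)^2$ used in the proof of Lemma 1 can be illustrated for classical distributions, where it is the familiar Pinsker-type bound relating relative entropy (in bits) to the $\ell_1$ distance. The sketch below applies it to a joint outcome distribution and the product of its marginals, in which case the relative entropy is the mutual information of that measurement. The random sampling, the $4 \times 4$ outcome size, and all names are assumptions of this illustration.

```python
import random
from math import log, log2

def relative_entropy(p, q):
    """S(p||q) in bits for probability vectors with q > 0 wherever p > 0."""
    return sum(a * log2(a / b) for a, b in zip(p, q) if a > 0)

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def random_joint(rng, rows, cols):
    """A random strictly positive joint distribution, flattened row by row."""
    weights = [rng.random() + 1e-3 for _ in range(rows * cols)]
    total = sum(weights)
    return [w / total for w in weights]

def product_of_marginals(joint, rows, cols):
    p_a = [sum(joint[r * cols + c] for c in range(cols)) for r in range(rows)]
    p_b = [sum(joint[r * cols + c] for r in range(rows)) for c in range(cols)]
    return [p_a[r] * p_b[c] for r in range(rows) for c in range(cols)]

rng = random.Random(2)
worst_gap = float("inf")
for _ in range(2000):
    joint = random_joint(rng, 4, 4)
    product = product_of_marginals(joint, 4, 4)
    mutual_info = relative_entropy(joint, product)
    gap = mutual_info - l1_distance(joint, product) ** 2 / (2 * log(2))
    worst_gap = min(worst_gap, gap)

print(worst_gap)
```

The gap stays nonnegative on all samples: a small mutual information forces the joint distribution to be close to the product of its marginals, which is the mechanism behind each term of the sum leading to Eq. (27).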



